Here we load a genotype matrix with 159 individuals and ~32k SNPs (31,813). We selected three individuals from each Human Origins population.
# install dependencies
packages <- c("tidyverse", "cowplot", "softImpute", "missMethods", "norm", "mvtnorm", "ggrepel", "plotly", "magrittr")
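# compare against installed package names only, not every cell of the installed.packages() matrix
missing_packages <- packages[!packages %in% rownames(utils::installed.packages())]
if (length(missing_packages) > 0) install.packages(missing_packages)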
library(magrittr)
library(ggplot2)
library(plotly)
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE, fig.width = 8, fig.height = 6)
#setwd('~/Documents/exp_dat_reading_group_2021/session_4/')
source('helper_functions.R')
# load data
geno_matrix <- scan("geno_matrix_three.txt", what = "character") %>%
  strsplit("") %>%
  do.call(rbind, .) %>%
  apply(., 2, as.numeric)
context_info <- readr::read_csv("context_info_three.csv")
## Rows: 159 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Individual_ID, Group_Name, Country, Makro_Region
## dbl (2): Longitude, Latitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
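As a small illustration of the parsing step above, the following sketch applies the same `strsplit` / `rbind` / `as.numeric` sequence to three made-up genotype strings. It assumes, as the loading code suggests, that the file holds one string per individual with one digit per SNP; the toy values are not taken from the real data.

```r
# Toy stand-in for the contents of the genotype file: one string per
# individual, one character (allele count 0, 1 or 2) per SNP.
# These values are invented for illustration.
toy_lines <- c("0120", "2101", "1002")

# Split each string into single characters and stack them as rows.
toy_matrix <- do.call(rbind, strsplit(toy_lines, ""))

# Convert column by column to numeric; the result is individuals x SNPs.
toy_matrix <- apply(toy_matrix, 2, as.numeric)

dim(toy_matrix)
toy_matrix[2, ]
```

Rows then correspond to individuals and columns to SNPs, which is the orientation `prcomp()` and similar functions expect.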
Here we plot two PCAs. The first uses all 159 individuals. For the second, we remove 9 individuals (plotted with red dots and labels) and project them onto a PCA computed from the remaining 150 individuals. The projected individuals show some shrinkage towards the origin.
## projecting: 1 2 3 4 5 6 7 8 9 done.
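The projection code itself lives in `helper_functions.R` and is not shown here, so the following is only a minimal sketch of the idea using base R's `prcomp()` and `predict()`. The data are simulated noise genotypes (not the Human Origins samples), and the names `held_out`, `pca_ref` and `shrink_ratio` are hypothetical.

```r
set.seed(1)
n_ind <- 60
n_snp <- 2000

# Simulated genotypes (allele counts 0/1/2); NOT the Human Origins data.
freqs <- runif(n_snp, 0.1, 0.9)
geno <- matrix(
  rbinom(n_ind * n_snp, size = 2, prob = rep(freqs, each = n_ind)),
  nrow = n_ind
)

held_out <- 1:5

# Reference PCA computed without the held-out individuals.
pca_ref <- prcomp(geno[-held_out, ], center = TRUE, scale. = FALSE)

# predict() centres the new rows with the reference means and multiplies
# them by the reference loadings.
projected <- predict(pca_ref, newdata = geno[held_out, , drop = FALSE])[, 1:2]

# Spread of the projected individuals relative to the reference individuals
# on PC1 and PC2; values below 1 indicate shrinkage towards the origin.
shrink_ratio <- apply(projected, 2, sd) / pca_ref$sdev[1:2]
shrink_ratio
```

With many more SNPs than individuals, the reference individuals shape the axes while the projected ones do not, which is one common explanation for the shrinkage seen in the second plot.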
First compute the static “background” PCA.
## projecting: 1 2 3 4 5 6 done.
## Downsampling:
## 0.5, 0.6, 0.7, 0.8, 0.9, then 0.91 to 0.997 in steps of 0.003 (35 levels), each followed by "projecting: 1 2 3 4 5 6 done."
## Done.
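The downsampling loop logged above can be sketched roughly as follows. How `helper_functions.R` treats the removed SNPs is not shown, so this sketch assumes mean imputation (removed SNPs are set to the reference mean, which is zero after centring). The simulated two-population data and the function `project_downsampled()` are hypothetical.

```r
set.seed(2)
n_per_pop <- 30
n_snp <- 3000

# Two simulated populations with different allele frequencies (illustrative only).
freq_a <- runif(n_snp, 0.1, 0.9)
freq_b <- pmin(pmax(freq_a + rnorm(n_snp, sd = 0.15), 0.05), 0.95)
draw <- function(n, freqs) {
  matrix(rbinom(n * length(freqs), 2, rep(freqs, each = n)), nrow = n)
}
reference <- rbind(draw(n_per_pop, freq_a), draw(n_per_pop, freq_b))
target <- draw(1, freq_a)[1, ]  # one extra individual from population A

# Static background PCA, computed once from the reference individuals.
pca_bg <- prcomp(reference, center = TRUE, scale. = FALSE)

# Hypothetical helper: remove `fraction` of the SNPs at random, then project.
# ASSUMPTION: removed SNPs are mean-imputed, i.e. zero after centring.
project_downsampled <- function(pca, genotype, fraction) {
  n <- length(genotype)
  removed <- sample.int(n, size = round(fraction * n))
  centred <- genotype - pca$center
  centred[removed] <- 0
  drop(centred %*% pca$rotation[, 1:2])
}

fractions <- c(0, 0.5, 0.9, 0.99)
coords <- t(sapply(fractions, function(f) project_downsampled(pca_bg, target, f)))
cbind(fraction = fractions, coords)
```

Under this assumption every removed SNP contributes nothing to the projection, so the coordinates move towards the origin as the removed fraction approaches 1. Other strategies, such as a least-squares projection on the observed SNPs only, would behave differently.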
Plot this with an animation, where you can scan through the amount of data removed.
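One way to build such an animation with plotly is to map the downsampling fraction to the `frame` aesthetic, so that each fraction becomes one animation frame. The sketch below uses invented coordinates that are simply scaled towards the origin; the data frame `frames` and its columns are hypothetical and do not come from the analysis above.

```r
library(plotly)

# Made-up full-data coordinates for three individuals (illustrative values only).
full_coords <- data.frame(
  individual = c("ind_1", "ind_2", "ind_3"),
  PC1 = c(4, -3, 1),
  PC2 = c(1, 2, -4)
)

# One block of rows per downsampling fraction; here each point is just
# scaled by (1 - fraction) to mimic shrinkage towards the origin.
fractions <- c(0.5, 0.7, 0.9, 0.99)
frames <- do.call(rbind, lapply(fractions, function(f) {
  data.frame(
    individual = full_coords$individual,
    fraction = f,
    PC1 = full_coords$PC1 * (1 - f),
    PC2 = full_coords$PC2 * (1 - f)
  )
}))

# `frame` assigns each row to an animation frame; the slider steps through them.
fig <- plot_ly(
  frames,
  x = ~PC1, y = ~PC2,
  frame = ~fraction,
  text = ~individual,
  type = "scatter", mode = "markers+text",
  textposition = "top center"
)
fig <- animation_slider(fig, currentvalue = list(prefix = "Fraction removed: "))

# Print `fig` in an interactive session or a rendered notebook to display it.
```

In the real analysis the background PCA would typically be added as a separate, frame-less trace so that it stays fixed while the projected points move.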
Here we do many iterations of the same amount of downsampling, to better observe the range of effects.
## projecting: 1 2 3 4 5 6 7 8 9 done.
## Downsampling:
## 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99 (10 levels), each followed by 10 repeats of "projecting: 1 2 3 4 5 6 7 8 9 done."
## Done.
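Repeating the same downsampling level several times can be sketched with `replicate()`. As before, this is a sketch on simulated data under the assumption that removed SNPs are mean-imputed; `one_replicate()` and `n_iter` are hypothetical names, with `n_iter = 10` chosen to match the ten repeats per level in the log above.

```r
set.seed(3)
n_snp <- 2000

# Simulated two-population reference panel (illustrative only).
draw <- function(n, freqs) {
  matrix(rbinom(n * length(freqs), 2, rep(freqs, each = n)), nrow = n)
}
freq_a <- runif(n_snp, 0.1, 0.9)
freq_b <- pmin(pmax(freq_a + rnorm(n_snp, sd = 0.15), 0.05), 0.95)
pca_bg <- prcomp(rbind(draw(25, freq_a), draw(25, freq_b)))
target <- draw(1, freq_a)[1, ]

# One replicate: PC1 coordinate after removing a random `fraction` of SNPs.
# ASSUMPTION: removed SNPs are mean-imputed (zero after centring).
one_replicate <- function(fraction) {
  removed <- sample.int(n_snp, size = round(fraction * n_snp))
  centred <- target - pca_bg$center
  centred[removed] <- 0
  sum(centred * pca_bg$rotation[, 1])
}

fractions <- c(0.5, 0.9, 0.99)
n_iter <- 10

# Rows are replicates, columns are downsampling fractions.
pc1 <- sapply(fractions, function(f) replicate(n_iter, one_replicate(f)))
colnames(pc1) <- fractions

# Minimum and maximum PC1 coordinate per fraction.
apply(pc1, 2, range)
```

Because a different random subset of SNPs is removed in each replicate, the spread within a column shows how much of the movement is due to chance rather than to the amount of data removed.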
It’s possible that our dataset is not very susceptible to downsampling, due to the large amounts of variation present in the populations. Here we subset to just the non-African samples, and repeat some of the above experiments.
## [1] 108 31813
## [1] 108 6
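The two dimensions printed above suggest that the genotype matrix and the context table were subset with the same row index. A minimal sketch of that step on toy data is shown below. Filtering on `Makro_Region` and the label `"Africa"` are assumptions, since the actual filter is not shown; the toy objects are hypothetical.

```r
# Toy metadata and genotype matrix; row i of the matrix belongs to row i
# of the table. All values are invented for illustration.
toy_context <- data.frame(
  Individual_ID = paste0("ind_", 1:6),
  Makro_Region = c("Africa", "Europe", "Africa", "Asia", "Europe", "Oceania")
)
set.seed(4)
toy_geno <- matrix(sample(0:2, 6 * 8, replace = TRUE), nrow = 6)

# ASSUMPTION: "Africa" is the region label to exclude; with the real data,
# inspect unique(context_info$Makro_Region) before filtering.
keep <- toy_context$Makro_Region != "Africa"

# Apply the same logical index to both objects so rows stay aligned.
toy_geno_sub <- toy_geno[keep, , drop = FALSE]
toy_context_sub <- toy_context[keep, ]

dim(toy_geno_sub)
dim(toy_context_sub)
```

Checking that both objects end up with the same number of rows, as in the output above, is a quick guard against misaligned metadata.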
## projecting: 1 2 3 4 5 6 7 8 9 done.
## projecting: 1 2 3 4 5 6 7 8 9 done.
## Downsampling:
## 0.5, 0.6, 0.7, 0.8, 0.9, then 0.91 to 0.997 in steps of 0.003 (35 levels), each followed by "projecting: 1 2 3 4 5 6 7 8 9 done."
## Done.
Ignore this
## [1] 0.9949212
## [1] -0.942749